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Abstract. 

We discuss in details a modified variational matrix-product-statc algorithm for 
periodic boundary conditions, based on a recent work by P. Pippan, S.R. White and 
H.G. Everts, Phys. Rev. B 81 (2010) 081103(R), which enables one to study large 
systems on a ring (composed of N ~ 10 2 sites). In particular, we introduce a couple of 
improvements that allow to enhance the algorithm in terms of stability and reliability. 
We employ such method to compute the stiffness of one-dimensional strongly correlated 
quantum lattice systems. The accuracy of our calculations is tested in the exactly 
solvable spin-1/2 Heisenberg chain. 



1. Introduction 

The recent technological advances in manipulating cold atomic gases with very high 
control and accuracy have opened up the possibility to test the quantum physics of 
many-particle systems in its roots pQ. Moreover, quantum many-body systems can 
sustain collective states of matter which have no classical analog, such as superfluid or 
insulating quantum phases [2], or even more exotic states such as the supersolid [3]. 
Experimental achievements along these direction renewed, in parallel, a considerable 
theoretical interest in the study of strongly correlated systems. Unfortunately, despite 
the apparent simplicity of their constituting Hamiltonian, the lack of a dominant exactly 
solvable contribution ultimately limits the applicability of conventional perturbative 
methods, thus drastically restricting analytic studies of both statics and non-equilibrium 
dynamical properties to very few cases. Approximated techniques or fully numerical 
approaches are therefore often required. 

In the early Nineties, White proposed a very powerful and accurate algorithm 
for numerically performing the renormalization group procedure in one-dimensional 
(ID) systems [4]. Its large applicability to the study of both static and dynamic 
properties of the low-energy spectrum of generic strongly correlated ID quantum 
systems, stimulated a considerable part of condensed matter theorists, which, in the 
subsequent years, produced a number of relevant results lying on this method, also called 
the Density Matrix Renormalization Group (DMRG) (see, e.g., Ref. [5] and references 
therein). The formal equivalence of the DMRG procedure with a Matrix Product State 
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(MPS) [B], which can be written in terms of a product of certain matrices, permitted a 
reformulation of the DMRG method as an optimization algorithm. This allowed a better 
understanding of the renormalization procedure, together with a number of promising 
extensions and improvements of the method, mostly coming from quantum information 
communities. Nonetheless, unlike the original DMRG protocol, these new algorithms are 
still not of common use and their potentialities have not been completely exploited [7]. 

One of the drawbacks of the celebrated DMRG algorithm, in its canonical 
formulation, is its intrinsic difficulty in simulating ground states of ID systems with 
periodic boundary conditions. In particular, the accuracy scales like the square of the 
number of states kept in an analogous simulation with open boundary conditions [3J. For 
this reason, its applicability to systems closed on a ring is limited to very few examples 
in the literature [8]. Much better performances can be obtained by reformulating 
the DMRG in terms of a variational procedure on the class of MPS with periodic 
boundaries [9]. This is done at the expense of no longer using sparse matrices, thus 
reflecting in a considerable slowdown of the procedure; moreover, due to the cyclic 
structure of the MPS, contractions of the various matrices also become more costly. 
However, from a conceptual point of view, the accuracy should drastically increase, since 
the original DMRG was equivalent to an optimization method in which the variational 
class of states intrinsically belonged to the MPS with open boundaries. Clever strategies 
for reducing the computational cost of such algorithms, in the case of translational 
invariant MPS, have been developed in Refs. [TQl Ell E2] and tested using Heisenberg 
and Ising spin- 1/2 chains as benchmark models. Very recently, an improved version of 
the algorithm in Ref . [9] , for generic non-translationally invariant systems, has been put 
forward by Pippan et al. [13] . As we will outline in the following, this enables a great 
computational speedup, so that precisions of the same order of magnitude of the ones 
obtained with open boundaries can be reached with a reasonable computational cost. 

The purpose of this paper is twofold. First we present in detail an optimized 
variational MPS algorithm for periodic boundary conditions, which enables one to study 
static properties of the ground state of large, generically non-homogeneous, systems on 
a ring (made up of N ~ 10 2 sites). The most important steps of the algorithm are those 
of Ref. [13J, we present however a number of suggestions that improve its stability. 
Specifically, first we provide a way to stabilize the generalized eigenvalue problem that 
has to be solved at each variational step. This consists in i) suitably choosing the gauge 
conditions for the MPS, as previously hinted in Ref. [H], and then in ii) wiping out the 
kernel of the effective norm operator by introducing small perturbative corrections in 
the effective operators. Furthermore, we also employ the decomposition of products of 
Ref. [13J, other than for multiplying transfer matrices, also to simplify the decomposition 
of the effective Hamiltonian, thus speeding up its diagonalization. The capabilities of the 
algorithm and the scaling of the precision with the dimension of the MPS are the same 
as those in the standard DMRG algorithm. Putting emphasis on all the major technical 
details of our algorithm, we aim at providing a rather comprehensive explanation which 
reveals itself compulsory for everybody who wants to implement it operatively. 
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The second purpose of the paper is to employ the algorithm in order to numerically 
evaluate the stiffness in one-dimensional models. Its evaluation inherently requires 
periodic boundaries and high degree of accuracy in the determination of energies, which 
can be reached within our approach and with a relatively small computational effort. 

The paper is organized as follows. In Sec. [2] we describe our variational MPS 
algorithm valid for finding the ground state of ID lattice systems on a closed ring, 
discussing the strategies we adopted in order to make it reliable and efficient (see 
Sec. 12 .4p . In Sec. |3]we provide a non trivial application of the scheme. In particular, 
we take advantage of the boundaries in order to compute the spin susceptibility in a 
ID spin-1/2 Heisenberg chain, thus locating the critical phase of the system. Finally, in 
Sec. H]we draw our conclusions and outline some possible lines of investigation. 

2. MPS variational method for periodic boundary conditions 

In this section we give a detailed overview of the numerical method employed in the 
simulations, with emphasis on the expedients we adopted in order to achieve the highest 
possible degree of stability and accuracy, while keeping the computational cost of the 
overall algorithm at minimum (see Sec. I2.4p . 

The ground state of a generic non translationally invariant one-dimensional (ID) 
quantum system on a lattice of N sites can be well approximated by a Matrix Product 
State (MPS) of the form: 

\rj>) = MA mi -A^.....A^)\H) l \t2) 2 ..-\^) N , (1) 

where \ik)k denote the d basis states we selected to describe the k-th site of the system 

(for the sake of clarity we take d equal on all sites, without losing in generality), while 

A [k],i h 

is a set of d matrices of dimension m x m (m is usually referred to as the bond 
link of the MPS), and where "■" represents the standard row-by-column matrix product 
(see, e.g., Ref. [15J). The accuracy of such ansatz generally improves when increasing 
m (as a matter of fact, any state of iV sites admits an explicit representation of the 
form with m ~ d N ^ 2 ). It turns out that, due to area law [16], for ground states of 
generic ID systems, accurate descriptions (even in the critical regime, where area law 
is violated, but only through an additional logarithmic term in the system size) can be 
obtained with a bond link dimension m which does not scale with the system length. 
Such a representation therefore drastically reduces the amount of needed resources 
from exponential (~ d ) to polynomial growth (~ Ndm 2 ) with the system size N, 
thus making simulations feasible. It is also worth stressing that, for each l^), the 
representation (prj) is not unique: indeed the rhs of such expression is left invariant when 
replacing the A^' %k matrices according to 

j\[k],ik _^ A'[%ik — y[k] . j^[k],ik . jy[&ffil] ? (2) 

where "©" stands for the sum modulus N, and where {V^, V^ 2 \ • • ■ , V^} and 
{W^\ W [2 \ ■ • • , W [N] } represent two sets of (non necessarily quadratic) matrices which 
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fulfill the isometric condition ■ = X, with X being the m x m identity matrix. 
The freedom implied by Eq. fl2]) can be fixed by imposing proper gauge conditions to the 
matrix elements entering Eq. (CQ) which, in some cases, happen to be helpful in speeding 
up the numerical search of the optimal ansatz state. 



2.1. Density Matrix Renormalization Group approach 

It has been proven that any real-space renormalization procedure results in an MPS 
structure [6J |9], therefore the best approximation to the ground state in an MPS-like 
form of a given Hamiltonian % is obtained through a variational principle, that is by 
minimizing the energy function E := £/M = with respect to all the 

MPS of the type in Eq. (JTJ. 

In the following we focus on ID lattice Hamiltonians of finite size N, of the type: 

TV 

^EE^®4. (3) 

k=l a,/3 

where a% denote some operator on site k (e.g., the spin operators with a = x,y,z for a 
quantum spin lattice system), while h^*'^ are the coupling strengths. Open Boundary 
Conditions (OBC) are set by imposing cr^ +1 = 0, while Periodic Boundary Conditions 
(PBC) can be fulfilled by requiring o~% +l = erf. The generalization to on-site and short- 
range interactions other than nearest-neighbor can be framed in the picture we are going 
to elucidate, even if in the following we will not explicitly discuss them. 

For any operator on site k, let us define the so called transfer matrix E^[ak] of 
dimensions m? x m 2 : 

d 

E [k] [v k ] = Y,^>^{A [k ^)*®A^\ (4) 
t,t'=i 

where (•)* denotes the complex conjugate. In this way we can write the expectation 
value £ = {ip[H\ip) of Hamiltonian Q on a generic MPS state (CQ) as 

TV 

s = E E • • • • • Elk] ^ • E[k+ Vk + i] ■ E[h+2] ■■■■■ E[N] ) ( 5 ) 

fe=l a,P 

where we adopted the abbreviations E^lfc] := E^ to describe the transfer matrix of 
the identity operator and (■ ■ •) := Tr[- ■ •] to represent the trace operation. A similar 
expression holds for the norm M = (ip\ip) of the MPS, which is written as a simple 
product of transfer matrices E^ on the identity matrix, i.e., J\f = (E^ ■ E^ ■ . . . ■ E^). 

This equation elucidates the dependence of the energy £ on the matrices A^' 1 , and 
in principle it can be used to determine them following a variational technique that 
minimizes £. For generic non translationally invariant systems, as we are supposing 
from the beginning, such a minimization problem would contain a huge number of 
parameters, that is of the order ~ 0(Ndm 2 ), and would be operatively intractable. 
In practice, standard optimization techniques, like the Density Matrix Renormalization 
Group (DMRG) approach, adopt clever schemes to achieve the minimization. The key 
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point of the algorithm is the following: one sequentially minimizes the energy with 
respect to the A's by fixing all of them except the ones on a given site k (rigorously 
speaking, the standard celebrated DMRG algorithm optimizes two sites at the same 
time [4j. As a matter of fact, in particularly adverse cases single-site optimizations 
may get stuck in local minima or converge much slowly, due to the suppression of 
fluctuations between the system and the environment [17J). As it is apparent from 
Eq. 05]), and the analogous equation for J\f, the dependence on A^< 1 is only quadratic. 
The minimization of energy thus consists of minimizing a quadratic polynomial E 
with quadratic constraints A/", which operatively translates into solving a generalized 
eigenvalue equation of the form: 

M k x k = XN k x k . (6) 

Here we have formally mapped the unknown coefficients of the m x m matrices A^> 1 
(with % = 1, . . . ,d), on a column vector x k of dimensions dm 2 , while the dm 2 x dm 2 
matrices M. k and N k (which hereafter will be referred to respectively as the "effective 
Hamiltonian" and the "effective norm operator") can be straightforwardly determined 
by employing the same mapping on Eq. (jSJ) and analogous for N '. In particular we notice 
that Nfc can be always written as a tensor product of a m 2 x m 2 matrix M. k times a d x d 
identity matrix / associated with the index % of A^' 1 |14j . i.e., 

N k := M k ® I . (7) 

It is also worth stressing that, by calling \ip(x k )) the many body MPS state ([T]) obtained 
by contracting matrix A^> 1 corresponding to the vector x k with all the other matrices 
at sites different from k, the matrices H^N^ satisfy the following identities, 

(tP{y k )\n\ij{x k )) = ylM k x k , (if>(ykM(x k ))=yt^kX k , (8) 

for all choices of the column vectors x k and y k . Accordingly it follows that Hfc,Nfc are 
Hermitian, that is non negative, and that the kernel of (if any) is always included 
in the kernel of M k , i.e., kern(Nfe) C kern(Hfe). The last inclusion follows from the fact 
that, if x k e kern(Nfc), then the associated many body vector state \ip(x k )) is identically 
null (indeed, according to the second expression of Eq. (jSJ), it has null norm). Therefore, 
from the first identity of Eq. ([8]) we have that y k M k x k = {ip{y k )\H\il)(x k )) = for each 
column vector y k . This implies that M k x k nullifies, proving that x k is indeed an element 
in the kernel of M k . 

Once the matrices A^ ,% associated with the fc-th site have been optimized, the 
next step consists in minimizing with respect to A^ k+1 ^ 1 and so on, until the rightmost 
site has been reached. One then proceeds analogously going leftward from site 
N to site 1, that is, one sweeps through the spins from left to right and vice- versa, 
determining at each step the matrices associated with a particular site. This variational 
procedure eventually converges to a minimum of the energ}|||. It turns out that, for open 

| It is worth noticing however that, as common in these numerical optimization strategies, there is 
no formal proof that the reached minimum will be an absolute one (indeed it is possible that it will be 
only a local minimum). Of course one could improve the confidence of the procedure by performing 
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boundary conditions (OBC), the problem can be considerably simplified by exploiting 
the freedom (j2J) to impose suitable gauge conditions on the matrices A^' % so that, at 
each step, the state is always normalized and can be kept equal to the identity 
matrix. Equation (J6]) would then become a standard eigenvalue equation of the type: 

M. k x k = Xx k . (9) 

To verify this, we first notice that since the first and the last matrix entering the MPS 
expression ([I]) need not to be directly connected via a dedicated bound link, one can 
assume they are expressible as |/l^' l L» = a e> \ [^ N ^ l ]ii' = ^'i 1 ' wr th 5^ 
being the Kronecker delta. The gauge conditions for the optimization on site k consist 
then in choosing a so called "left isometry condition" for all the matrices on the left 
of k: 

d 

J2(A^y ■ = 1 , Vj = l,...,fc-1, (10) 

i=l 

and a "right isometry condition" for matrices on the right of k: 

d 

^A^- (A^Y =1 Vj = k + l,...,N, (11) 

i=i 

with I being the mxm identity matrix. In this way M becomes a constant (i.e., M = m) 
which can be trivially absorbed in the definition of H^. 

Going from left to right, condition ( JTUI) on the optimized matrices A^' 1 is enforced 
by performing a Singular Value Decomposition (SVD) on the matrix Vi a ,p = A^£ 
(a, f3 = 1, . . . ,m are row and column indices of the matrix A^' 1 ; the site index i has 
been grouped into the row index of the matrix V - in this way Eq. (TTUl) equals to say 
V'V = X) such to obtain V = UDW with U, W isometries and D > diagonal matrix. 
One then discards DW and simply substitutes the matrices with A'^ 1 = Ui a ^. 
Going from right to left, one does a similar procedure after grouping site index i into 
the columns of the V matrix: V a ,ip = A^£ (so that Eq. (fTUl) translates into VV^ = X), 
and performing a SVD on V. The matrices A^ ,% are then changed with A'^h 1 = W a ^p. 

2.2. Constructing the effective Hamiltonian 

The core of the DMRG algorithm consists in the iterative resolution of Eq. (Q in the 
case of OBC or of Eq. for PBC. Nonetheless, even the construction of the effective 
Hamiltonian Hfc (and of the effective norm operator for PBC) may constitute a 
bottleneck. As a matter of fact, it turns out that, for OBC, the number of operations 
that are needed to construct is relatively small and scales as m 3 . 

In order to achieve such a goal, at every step of the variational algorithm it is 
convenient to store some operators that are products of transfer matrices. Let us suppose 

optimization iterations that work on the optimization of the tensors pertaining to (say) a couple of 
consecutive sites. 
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we are optimizing the k-th site. In this case we may want to save the matrices: 

ji<m] £[1] . E [2] . . E [m-1\ 

B l<m] . = E [l] . E [2) . . £ [m -l] ( (12) 

ft[<m] ;= £m-2 ^[i] . _ . . ^[J+II^J ..... £[m-l] ; 

for each m < k and analogous ones for every m > k. In this way one has 

S = (H [<k] ■ E [k] ■ T [>k] ) + (T [<k] ■ E [k] ■ U [>k] ) (13) 

+ E ( r[<k] ■ E[k] ^ ■ B t k] + ^ <fcl • Ei V k ] ■ r [>k] ) , 

a,/3 

J\f = (T [<k] ■ E [k] ■ T [>k] ) , (14) 

so that the effective Hamiltonian and the norm operator can be explicitly 
constructed. Quite remarkably, one can see that, when going to the next iterative step 
at site k + 1, the corresponding matrices f^ <k+l \ B^ k+1 \ ^ <k+1 ^ can be built up from 
the corresponding ones with indexes ' <fc ' by multiplying them by the transfer matrix 
corresponding to site k and by properly adding the result. In the case of OBC, each one 
of these passages requires one to perform a number of fundamental operations which 
scales as m 3 . Besides that, since the A matrices of Eq. (TjQ) are chosen such to fulfill an 
isometry condition of the type (fit)]) (left of k) or (TTTj) (right of k), the operators T are 
trivial and correspond to the identity. This implies that the effective Hamiltonian ( ITBl is 
a sparse matrix, thus dramatically speeding up the resolution of the eigenvalue problem 
in Eq. © i 

With PBC, two major obstacles emerge. On one hand, the resolution of a 
generalized eigenvalue problem of the type in Eq. (Q can have problems if the matrix 
Nfc is ill-conditioned, while one in general can no longer benefit of the sparseness of 
matrix M./,. On the other hand, the number of operations required to build up and 
Nfc is considerably larger, due to the absence of boundaries from which performing the 
contractions. It turns out that each of the basic contractions that are needed, that is 
the multiplication of a transfer matrix of dimensions m 2 x m 2 with a product of transfer 
matrices of the same dimensions, require 0(m 5 ) operations. This drastically limits the 
capabilities of the algorithm to very small sizes (N ~ 20 -r- 30) with a low bond link [9]. 

2.3. Truncated SVD 

Remarkably, as discussed in Ref. [T3] . from a computational point of view the second 
obstacle can be overcome by using the following observation: if one performs a SVD 
decomposition of a sufficiently long product of m 2 x m 2 transfer matrices (fc > 1), the 

§ Since only the eigenstate corresponding to the smallest eigenvalue is needed, powerful numerical 
methods such as Davidson or Lanczos techniques can be suitably used. Unlike brute-force 
diagonalization approaches, these methods also take advantage of the sparseness of the matrix, requiring 
only a matrix-vector multiplication routine. 
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singular values in general will decay fast, i.e.: 

£[l].£[2].....£M~^ S . UjV T ; (15) 

where p <C m 2 (m 2 being the total number of singular values of the product in the 
lhs), while Uj and V 3 - respectively denote a left and a right singular vector of size m 2 . 
Intuitively Eq. ( fl5|) can be justified by the fact that, in the limit of large N, the local 
physics of the system should not really be affected by the properties of the boundaries: 
consequently considering that, for OBC, imposing the gauge conditions (TIP]) . ( TIT]) is 
formally equivalent to enforce exactly the structure of Eq. ffT5|) with p — 1, one thus 
expects that, for PBC, the rhs of Eq. ( fl5i) should constitute a good approximation of 
the lhs. 

Therefore, if one knows a priori that p <C m 2 , he can evaluate products of transfer 
matrices of the type in Eq. (1121) by performing a truncated SVD which enables the 
calculation of only the largest p singular values out of the m 2 possible outcomes. Such 
an operation requires only 0(p x m 3 ), that is a m 2 factor less than the standard SVD. 
However we verified that, in order to achieve good accuracies for the sizes we have 
considered, in practice one has to choose a value of p which scales approximately linearly 
with m: p ~ m. This limits the actual gain of the overall operation by a factor of m. [jj| 
The method is explained below in the Sec. 12.3.11 Once the product of k transfer matrices 
has been written in the form of Eq. ( fl5|) . multiplying it with another matrix on 
the right of it such to obtain 7~' >A;+1 1 is relatively easy and also requires 0(p x m 3 ) 
operations, involving only the p vectors Yj [similarly, multiplying it on the left by a 
transfer matrix requires acting only on Uj with 0(p x m 3 ) operations]. 



2.3.1. Operative algorithm for the truncated SVD The SVD of a generic matrix 
M G A^ m 2 xm 2 generally requires 0(m 6 ) elementary operations [more precisely, taking 
into account the tensorial product structure of the transfer matrices, in this specific 
case 0(m 5 ) operations are needed]. If one is only interested in the contribution coming 
from the p largest singular values of M, one can employ a truncated SVD according 



to the following prescription (see, e.g., Ref. [13] and Appendix A for details). Taking 
advantage of the tensor product structure of the transfer matrices, this requires only 
0(p x m 3 ) operations. 

• Generate a random matrix x G M. p xm 2 of full rank; 

• Multiply x with the input matrix M on the right: y = xM; 

• Orthonormalize the rows of y by using a Gram-Schmidt decomposition, so to 
construct a matrix y' G -M pxm 2; 

| In Ref. [12] a gain of a factor m 2 over the plain algorithm is claimed, with the truncated SVD 
strategy. We point out that, within our numerical experience, we could reach adequate precisions for 
our measures only by scaling p linearly with m (see, e.g., data in Fig. 2]). This in practice would reduce 
the computational effort only from m 5 to m 4 . 
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• Take the transpose conjugate of y' and multiply it with the input matrix M on 
the left: z = M(y'Y; 

• Perform a SVD of such obtained z G Ai m 2 X p matrix, and write it as z = UDV; 

• Evaluate V = V'y', so that M can be written in SVD form as M = UDV. 

2. 4- Stabilization of the generalized eigenvalue problem 

Let us now come to the core of the optimization procedure, once the effective 
Hamiltonian and the effective norm are built up. We will discuss this point by presenting 
few "tricks" that we found useful to implement, in order to enhance the stability of the 
algorithm. 

As it has been mentioned before, the generalized eigenvalue problem in Eq. ([6]) is 
typically harder to solve than the standard one (Q. Firstly, from a technical point of 
view, when using a periodic structure for the MPS, it involves non-sparse matrices H*. 
and Nfc. This is because of the lack of a starting point (the two outermost sites of 
the chain, for OBC) from which the left (fTUI) and right (TTTT) isometry conditions could 
be recursively applied. As a consequence, with PBC the pure transfer matrices T can 
no longer be written as identities, and operators (fT3]) - (ri4j) are generally mapped into 
two non-sparse dm 2 x dm 2 matrices, thus inevitably slowing down the efficiency of the 
diagonalization procedure. A second point, more of conceptual nature, is related to the 
conditionability of the problem. Equation ([9]) is well conditioned, provided the spectrum 
of matrix Hfc is bounded. This is not sufficient for Eq. fl6]), where one also necessarily 
has to ensure that the matrix is strictly positive. The emergence of non trivial kernel 
spaces for indeed introduces a critical instability in the diagonalization algorithms, 
which are usually based on convergence criterions of the type 1 1 HI fc £ fe — A£fc 1 1 / 1 1 1 1 < \i , 
with fi small parameter controlling the convergence to the solution (typical values are 

/' < in '")• 

To cope with this problem, we found it convenient to wipe out the kernel of 
by forcedly adding a small correction — > Nk(e) := + eX and M.k — > Mk(s) := 
Hfc + y/sZ- Indeed, considering that the kernel of is included into the kernel of H^, 
it follows that the generalized eigenvalues associated with the matrices Hfc(e), Nfc(e) 
form two sets: the first is composed by elements A(e) — A + 0(y/e), with A being 
the generalized eigenvalues of the couple H^Nfc; the second instead is formed by terms 
which scale as Therefore, for e — > 0, the minimum value of the A can be computed 

as the minimum generalized eigenvalue of Hfc(e), Nfe(e) - the only price to pay is the 
fictitious introduction of an error of order ~ 0{y/e). For practical purposes, reasonable 
values in order to remove instabilities, without substantially affecting the simulation 
outcomes, are e < 10~ 12 . 

% Fast methods for finding the low-energy spectrum of large matrices, like the widely used Davidson or 
Lanczos techniques, typically require to provide the application of the effective Hamiltonian and norm 
operator onto a generic input vector £. With OBC this enables a great computational speedup due to 
the sparseness of the matrices; with PBC such speedup unfortunately vanishes. 
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Once the kernel of N/. has been eliminated, other non-critical instabilities due to 
numerical accuracy may emerge if eigenspaces of very small (positive) eigenvalues are 
present. As pointed out in Refs. P [H], it is generally very helpful, for an accurate 
numerical convergence of the code, to take advantage of the gauge arbitrariness and 
redefine the matrices and f% in such a way as to increase the smallest eigenvalue of 
N k as much as possible. To this aim it is useful first to perform an approximate SVD 
on the non trivial m 2 x m 2 component of f%, i.e. the matrix of Eq. ([7]). Using the 
recipe of Sec. 12.3.11 we thus write 

M fc «^£f* Uf* (Vf fc ) T , (16) 
j 

where Sf fe , £jf fc , ■ • • , are the singular values of the matrix in decreasing order. 
Here we remark that, if we used OBC, at any point k, the left and the right side of 
the lattice would have become factorized so that was a tensor product [besides, 
using suitable gauges expressed by the left and right isometry conditions (fTOl) -(TTTT) . it 
would have also been possible to write it as a tensor product of identities]. On the 
contrary, PBC originate unavoidable correlations between the two sides. Nonetheless, 
for sufficiently long systems, these correlations can be reasonably considered small, so 
that M k is effectively close to a tensor product. This implies that, in Eq. ( !T6|) . the 
largest singular value H 1 k is by far greater than all the others. 

A clever gauge on the MPS structure can now be imposed such that the leading 
factorized term associated with the decomposition of Eq. f fT6|) gets close to the 
identity operator. It can be constructed by taking the m x m matrices [iVi]^ : = 
(E M fc) _ 1/2 [0 M fc] ^ and [N ^ g . = (E M fc) _ 1/2 [Y f% 5 , where £f*,Uf*, and Vf fc define 

the leading contribution of Eq. (fl6l) . The m x m (invertible) gauge matrix X for the 



left side of the system can then be constructed by solving the equation XNiX^ = X (sua 
analogous matrix Y, which satisfies F^Y^ = X, is built up for the right side) 0. Such 
gauge maps the eigenvalue equation ([6]) into 

M' k x' k = XN' k x' k , (17) 

with 

M' k = (X <g> Y) H fe (2) y+) , (18) 

N' k = (X <8) Y) N k (X f (8) F f ) , (19) 



As it is apparent from Eq. (Tl9l . the new effective norm operator N' k satisfies the property 
we required at the beginning, since the leading term has been transformed into the 

+ This equation can be formally solved by noting that Ni = X^(X^)- 1 = (X^X)- 1 so that 

X^X = TV-f 1 and therefore X — yj N± l . A solution is then obtained by taking the inverse square 
root of Ni, if this last matrix is invertible. In case of a non- null kernel of N\, the Moore-Penrose 

pseudoinverse V Ni can be considered. 
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Figure 1. Stabilization scheme for the generalized eigenvalue problem. A suitable 
gauge is provided by the invertible matrices X (Y) (the indexes ~ indicate inverse 
matrices, so that product matrices of the type highlighted in the oval blob exactly equal 
the identity matrix). All the left and right operators (Lj and Ry) that are needed to 
build up H/j and N& [see Eqs. (fT3f and (TH|) respectively] have to be contracted and 
transformed according to Eqs. (IT^ - fllT)!) . Consequently, the input/output eigenvectors 
(dashed lines and boxes) are transformed following Eq. PU|) . The black dot connecting 
input and output matrices stands for the physical operators of the Hamiltonian at site 
k that have to be accounted for, in some terms of Eq. (flU)) . 

identity operator 0- The application of the gauge, mapping the original eigenvalue 
problem ([6]) into (TT7T) . can be operatively implemented by multiplying all the left (right )- 
side operators that are needed in order to build up and Nk [for an explicit expression, 
see Eqs. (ITB1) ] by matrix X (Y), as graphically shown in Fig. [TJ Once Eq. (117)) is solved, 
one has to keep in mind that also the solution x' k is gauged with the same matrices. 

Finally we notice that the resolution of Eq. f[T7j) can be considerably speeded up 
by approximating H' fc along the line of what already done in Sec. 12.3.11 Explicitly, 
we expanded W k via SVD and kept only the contributions associated with its s 
largest eigenvalues [typically s can be kept of the same order of the parameter p of 
Eq. (fit)]) ]- By doing so, the number of fundamental operations needed to solve Eq. (fITj) 
through standard large-matrix eigenvalues solvers, such as the Lanczos method, can be 
drastically reduced. In practice, by operating a cut in the SVD representation of W k , 
one substantially improves the efficiency of the matrix-vector multiplication routine: 
y' k = M.' k x' k [where x' k , y' k are generic input/output (d x m 2 )-dimensional vectors]. This 
is crucial, since for matrix dimensions of the order of the ones considered in our MPS 
simulations, the routine is typically called O(10 2 -=-10 3 ) times, and therefore its repeated 
iteration constitutes the actual bottleneck of the resolution of Eq. (JTTj). 

More in the specific, we have to stress that the scaling of the contractions needed 
to perform the Hamiltonian-vector multiplication is not substantially modified, going 

* We remark that the gauge transformation presented here is not related with the one used in Ref. [T3] ■ 
As a matter of fact, in our simulations we implemented both of them, but found that stabilization is 
better achieved with the one discussed in the text. 
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from 0(px d x m 3 ) to 0(s x d x m 3 ). Nonetheless, what really changes is the prefactor 
of these scalings. This can be quite large (~ 20) in the first case, accounting for the fact 
that the effective Hamiltonian is made up of various terms coming from the fictitious 
division in sectors of the system (see Sec. I2.5l for details) and their interconnections, and 
also from the requirement of site k to be kept free in order to employ the variational 
optimization. On the other side, in our procedure the prefactor basically equals the 
unit, being it the fastest way to perform the multiplication, while essentially keeping 
the same Hamiltonian structure. We carefully checked that this procedure does not 
notably affect the accuracy of the operation. 

2.5. Optimization algorithm 

A single-site minimization procedure which is very convenient for ID systems with PBC 
is the circular scheme proposed in Ref. [13], and here depicted in Fig. |2j Basically, 
the optimization proceeds following a circular pattern, rather than using the standard 
backward and forward pattern that is generally used for OBC. 

In order to guarantee that the number of transfer matrices that have to be multiplied 
such to form Eq. ffT5j) is sufficiently large, one divides the circular ring into three 
concatenated sectors of spins (reasonable values for the global system size are N > 10 2 ). 
Optimization then starts from one section, say the first one, and proceeds along a fixed 
direction, say counterclockwise from site 1 to site N/3. Before initiating it, one has to 
construct the partial effective Hamiltonian and all the required operators corresponding 
to the two other sectors (second sector from site N/3 + 1 to site 2N/3, and third 
sector from site 2iV/3 + 1 to site N) in a SVD-like fashion, following the prescription 
of Sec. 12.3.11 Then a set of operators for the optimizing section can be constructed by 
successively adding transfer matrices from the rightmost part of the optimizing section 
to the left of the operators constructed for the section on the right. After this, the 
normal optimization procedure can go on, until the border of the optimizing section has 
been reached, say site N/3. At that point, the procedure repeats from the beginning, 
with the section on the right (from site N/3 + 1 to site 2N/3) as the new optimizing 
section. And so on, in a circular way. A pictorial representation of the scheme discussed 
here is given in Fig. [2j 

Here we stress that, for each optimization step, once all the required products of 
transfer matrices are constructed, of course operators relative to the three sectors have 
to be contracted in a circle (see lines connecting blocks in Fig. [3J so to build up the 
effective Hamiltonian Hfc and the effective norm operator for a given site k. This 
operation can be performed fast, due to the truncation in the SVD of the left and 
right sectors. Then one proceeds with the stabilization of the generalized eigenvalue 
problem ((6]), following Sec. I2.4[ before solving it. Its solution eventually provides the 
optimized entries of the matrices A™ in the MPS state at position k. 

In the following we will use for the first time the proposed MPS algorithm for PBC 
in order to compute the energy response in a spin ring, when subjected to a change in 
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Figure 2. Circular optimization algorithm: the lattice ring of length N is divided 
into three sectors (S). The single optimization sweep proceeds from site 1 to site N. 
When a new sector is entered (S2 in the example given in the figure), one builds up 
the product of transfer matrices for the other two sectors (SI and S3). At the k- 
th optimization step, before performing the minimization in terms of the generalized 
eigenvalue problem ([S]), in order to build up the effective matrices and one 
contracts the transfer matrices in the optimizing sector (S2) to the left (right) of the 
optimizing site with the left (right) truncated SVD, following the order given by the 
arrows. 



the boundary conditions (twisted boundary conditions). This permits to quantify the 
spin stiffness (or analogously the superfluid density in bosonic models). Hereafter we 
will only be interested in the ground state energies, while expectation values of operators 
will not be taken into account. This will enable us to obtain reliable results also using 
MPS descriptions with a relatively small bond link m ~ 20. 



3. Spin stiffness in the Heisenberg chain 

We consider here the spin- 1/2 Heisenberg model, that is described by the Hamiltonian: 



-(S+Sj- +l + h.c.)+AS*S: 



(21) 
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Figure 3. At each optimization step, the matrices and are built up by 
contracting the transfer matrices of the three sectors (each line connecting the various 
blocks indicates a contraction). Due to the truncation in the SVD of the left and 
right sectors, links and are made up of only p elements. This makes it 
possible to considerably reduce the computational time for the contractions. The 
two circles connecting the left and right operators U and V of the left and right 
sectors stand for diagonal matrices corresponding to the singular values of the two 
SVD, thus implying j = j' and 1 = 1'. Expliciting the index mapping which leads 
to the eigenvalue Eq. (J6]), we get the following component- by-component equation: 
[Hfe]^'^ [sGfc]J£, = A [Nkfxp^v [xk]]f v , where the eigenvector corresponding to minimum 
eigenvalue [iCk]m, := (A^> lk ) 



where a" = 2Sf (a = x, y, z) denote the spin- 1/2 Pauli matrices on site j, = S x ±iS y 
are the raising/lowering spin operators, J is the coupling strength and A is the 
anisotropy. Hereafter we will set J = 1 as energy scale, and use units of h = = 1. 

In the thermodynamic limit and at zero temperature, this model exhibits a gapless 
critical phase for | A| < 1 with quasi-long-range order emerging from power-law decaying 
correlation functions, while it is gapped and short-ranged for |A| > 1. The gapless 
phase is characterized by ballistic transport, which corresponds to a finite spin stiffness 
p s (A) at the thermodynamic limit [IE]. This quantity expresses the sensitiveness to a 
magnetic flux added in the system when PBC are assumed, and can be formally treated 
by performing a twist in the boundary conditions. Namely, the addition of a flux <f 
along the z direction corresponds to take the twisted boundary conditions 

C± C± p ±i4> OZ _ CZ fooN 

°N+1 — °1 e J °N+1 — °1 > 

where N denotes the length of the ring. The stiffness is then defined as follows: 



Ps ■= N- 



(f>=0 



where E (<j)) is the ground state energy with a magnetic flux of intensity <f. Here we 
remark that p s evaluated with OBC is strictly zero, since with open boundary geometry 
one can always cancel the effect of the twist by applying suitable gauge conditions to 
the spins. 
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Figure 4. Convergence to the ground-state energy of a spin- 1/2 Heisenberg chain 
of length N = 150 and anisotropy A = 0.5 with twisted boundary conditions, as a 
function of the variational steps in the MPS optimization algorithm. Here we used a 
bond link m = 18, with truncation indexes p = 50 and s = 35; we imposed a random 
initial guess and performed 80 sweeps (energies are rescaled over the ground state 
energy E with PBC, so that E* = E — E + 5x 10~ 5 . For the sake of clarity, we used 
a logarithmic scale on the y axis). 



By solving the Bethe ansatz equations for model ( 121]) . one can show |T8] that, at 
the thermodynamic limit and for |A| < 1, 

. . . 7r sinful . . , , 

p,(A) = J w with A = cos /i , 24 
4/i(vr - fi) 

while p s (A) vanishes for | A| > 1. The point | A| = 1 is characterized by a metal- insulator 
transition between a metallic phase with a finite p s and a gapped insulating regime with 
ps = 0, following a Mott mechanism. 

We employ the MPS algorithm with PBC described above in order to evaluate 
the stiffness. In Fig. H] we show the actual system energy at each step of the MPS 
algorithm. As one can notice, since the algorithm is variational in the energy of the 
system, the leading behavior of the curve is monotonic decreasing. When the energy 
is close to its convergence point, fluctuations due to the truncation in the SVD process 
become evident % Anyway, provided the two truncation parameters p and s are chosen 

jt To be more quantitative on this point, we need to address the effects in the convergence of the 
MPS algorithm with the two truncation parameters p and s. Typically, increasing the truncations also 
increases the fluctuations in the converged energy. This is quite different from the energy behavior by 
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Figure 5. Ground-state energy of a Heisenberg chain of length N — 150, A = 0.5, 
as a function of the twist <j>, obtained by averaging the data in Fig. |4] over the last 
five sweeps. The red curve is a quadratic fit of numerical data (black circles), with 
c 2 fa 1.03435 x 10~ 3 . 



sufficiently large, as it is the case in Fig. HI an average over the last sweeps is in general 
sufficient to sweep away all the unwanted fluctuations. This clearly emerges from the 
averaged converged values of the energy, plotted in Fig. [5] as a function of the twist 
angle <fi for twisted boundary conditions. Indeed a neat quadratic behavior is visible. 
We then fitted the curve E ((f)) with a quadratic law: E ((fi) = E (0) + C2<p 2 (see the red 
line), obtaining the prefactor c 2 which is directly related to the stiffness: 

p s = 2Nc 2 . (25) 

In the figure we used a bond link m = 18, which is considerably smaller than the 
one used in actual simulations performed for OBC (for well optimized codes, m can 
typically reach values one order of magnitude larger). Nonetheless we should also point 
out that typically large numbers of m are required for evaluating local observables or 
correlation functions; on the contrary, here we are only interested in the ground state 
energies for which large values of m are less crucial. As an example, in the case of the 
isotropic Heisenberg chain, with m = 18 we found E Q /N —0.443138 which is still 
not far from the thermodynamic limit value obtained from Bethe ansatz calculations 
e = - In 2 + 1/4 « -0.443147. 

A plot of the spin stiffness ( 123]) as a function of the anisotropy A, in the critical 
phase | A | < 1 is shown in Fig. [6J where data have been collected for systems of N = 180 



varying the bond link to, which strictly governs the accuracy of the algorithm. We will explain this 
point more in detail later, in Sec. 13.11 
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Figure 6. Spin stiffness in the critical phase of the Heisenberg spin- 1/2 chain, Eq. (|2Tj) 
with |A| < 1. Symbols are obtained from numerical simulations of a system with 
N = 180 sites, for different values of the bond link m (and, accordingly, different 
values of p and s, see the legend). The straight curve is the analytic estimate as 
obtained from the Bethe ansatz, Eq. ([24)) . 



sites. We obtained ps as a response in the ground-state energy to a twist in the BCs, 
using Eq. (1251) . As it is apparent from the figure, modest values of m are sufficient 
to attain good accuracies for p s : for values of the anisotropy sufficiently far from the 
isotropic limit, at m — 18 we are able to reach precisions in p s of the order ~ O(10~ 5 ). 
In particular, the convergence to the exact value at the thermodynamic limit appears 
to be approximately power-law in 1/m, with an exponent ~ 2.5 as explicitly shown 
in Fig. We also noted that the dependence of the stiffness on the system size N is 
negligible on the scale of Fig. |6] for N > 10 2 , even if they can become relevant in a 
comparison with the exact value at the thermodynamic limit, as it is done in Fig. [7J 

Special care has to be taken in the region close to the border of the critical 
zone A ~ 1: this requires higher precisions. Moreover, here the convergence of 
the minimization algorithm becomes much slower (more than one hundred of sweeps 
are required in order to reach the minimum of energy). The reason is due to the 
antiferromagnetic character of the Hamiltonian in this regime [12] ; this problem could 
be at least partly overcome by performing a minimization algorithm similar to the one 
proposed here, but which optimizes two sites at the same time. 
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Figure 7. Absolute differences \Sp s \ between the stiffness in the Heisenberg model 
with N = 180 sites, evaluated with a given bond link dimension m, and the exact 
value at the thermodynamic limit, as given by Eq. (|24[) . The continuous red line 
denotes a power-law fit of data \5p s \ ~ m -256 . Data shown are for A = 0. 

3.1. Convergence with the truncation parameters 

We now discuss the stability of the algorithm with respect to the various approximations 
introduced for speeding up the PBC problem. Our MPS algorithm for PBC is governed 
by three control parameters: the size m of the matrices in the MPS representation ([T]), 
the two truncation parameters on the singular values of long products of transfer 
matrices (p) and on the effective Hamiltonian (s). As we already discussed before 
and as it is apparent from Fig. [7J the bond link controls the global accuracy of the 
algorithm. This is analogous to the standard DMRG algorithms, where an increase of 
m typically produces a converged value of the ground state energy which monotonically 
decreases towards the exact value. In our case, we found an approximately power-law 
convergence with m of the stiffness p s . 

Let us concentrate on the effects of the truncations. We first discuss the convergence 
of the algorithm with p. This parameter expresses the number of singular values that are 
kept in the SVD representation of the product of transfer matrices belonging to sectors 
other than the one on which the optimization is running on (see Sec. 12.51 and Fig. |2]), 
and it ranges in the interval [1, m 2 }. Only in the case p = m 2 no errors are introduced in 
the SVD representation; any p < m 2 will introduce further errors in the algorithm. As 
it is clearly visible from Fig. [HI where in the different panels we show the convergence 
of the ground state energy for different values of p fixing m and s, a value of p < m 2 
introduces non-monotonic fluctuations in correspondence of any change of sector, during 
the optimization algorithm (therefore with periodicity 1/3, in units of the number of 
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Figure 8. Convergence of the MPS periodic optimization algorithm, as a function 
of the truncation parameter p. The different panels display the ground-state energy 
per-spin of a spin-1/2 Heisenberg chain of length N — 150 and anisotropy A = 0.5 
with twisted boundary conditions, as a function of the variational steps. Here we 
fixed the bond link m = 18, and the index s — 35 (energies are rescaled such that 
E* = E + 0.375). The four panels are for p = 10, 15, 20, 30. Note that the scale on 
the y axis of the upper left, the upper right, and the two lower panels are different. 

sweeps). On the contrary, while the algorithm is running in a given sector without 
changing it, the energy is monotonically decreasing, due to the intrinsic variational 
character of the optimization. When p is increased, these fluctuations diminish, until 
they become hardly visible on the scale of the figure for p > m (note that the energy 
scale in the panels of Fig. [8] is different). For example, in the case m = 18 we obtained 
quite stable results for p > 30. We remark that for low values of p (p = 10, 15 in the two 
upper panels of Fig. E}, due to strong fluctuations, we could not even see an increase of 
the energy with the magnetic flux <f), therefore it was impossible to extract a value for 
the spin stiffness p s . 

We finally focus on the convergence with s, which quantifies the number of singular 
values kept in the SVD of the effective Hamiltonian W k . It turns out that, as it happens 
with the other truncation parameter p, by increasing s £ [1, m 2 } we found a decrease of 
the fluctuations (see Fig. [9]). However these fluctuations can be distinguished from the 
ones due to p, since they do not have a definite periodic structure and are random, as a 
function of the steps in the algorithm. This is because at each variational step one has 
to construct an effective Hamiltonian. Like for p, we found that a value s > m is able 
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Figure 9. Same as in Fig. [HI but keeping p = 35 fixed and varying the truncation 
parameter s — 15, 20, 25 30. Note that the scale on the y axis of the two upper panels, 
the lower left and the lower right panel are different. 



to reduce such fluctuations such to obtain stable values for the converged energy (in the 
example of Fig. |9]with m = 18, we noted that s > 30 is sufficient to compute p s ). 

We point out that the correct evaluation of the susceptibility with respect to (ft in 
Eq. ( l23|) needs an high degree of convergence of the energy E ((/)). As a matter of fact, 
fluctuations caused by low values of the truncation parameters may completely hide the 
small variations of E induced by infinitesimal magnetic fields (J), thus invalidating our 
procedure for the evaluation of p s (as it is visible if the upper panels of Figs. El E]). In 
conclusion, as already stated in Sec. 12. 3[ for our calculations we required p,s > m, thus 
practically worsening the computational requirements of the PBC algorithm to 0(m 4 ), 
instead of 0{m?) as it is claimed in Ref. [13]. On the other hand, in order to give a 
rather precise estimate of the ground state energy E(<f> = 0), it is probably sufficient 
to take lower values of p (as in the upper left panels of Figs. El El where the relative 
error induced by the fluctuations is already 5E/E ~ 10 -6 , with SE being the size of 
fluctuations and Eq the ground state energy). This also explains why Pippan et al. 
were able to go to larger values of m (m ~ 50) but apparently smaller values of p, in 
order to get the ground state energy of the s — 1 Heisenberg model with their PBC 
algorithm [13] 0. 

ff We did not push our simulations further in m, since already with the parameters used here we 
obtained rather accurate results, apart from the points close to A = 1. However, in a recent paper we 
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4. Conclusions 

In this paper we presented a variational procedure for numerically finding the ground 
state of a generically non-translationally invariant one-dimensional Hamiltonian model, 
which can be accurately written as a suitable MPS. This is constructed starting from 
the work in Ref. [13] . which in turn is an improvement of the variational formulation 
of DMRG given in [9]. On top of this, we elucidated some technical improvements for 
stabilizing the code, as detailed in Sec. 12.41 which also reveal useful in the precision 
measurements of energies that we performed subsequently. 

The algorithm globally scales as 0(N xpxm 3 ), where N is the system size, m is the 
size of each matrix in the MPS representation, while p is the number of singular values 
that are kept in typical SVD decompositions of transfer matrices (generally, for good 
performances one has to take p ~ m). The accuracy of the algorithm is the same as for 
the open-boundary case, once the dimension m of the matrices is fixed in both cases (it 
is therefore also the same as for the original DMRG algorithm, where m is the analogous 
of the number of states kept for describing each block). We point out that, as typically 
implemented in good DMRG codes, also in this procedure it is in principle possible 
to take advantage of some specific symmetries of the Hamiltonian under scrutiny. In 
particular, Abelian U(l) symmetries can be included [19]. Reasonably, this would admit 
a computational speedup of up to one order of magnitude, as thoroughly noticed in 
analogous benchmark MPS codes with open boundary conditions. 

We showed how to exploit the possibility to change the boundary conditions on the 
ring, in order to evaluate the response to a magnetic flux added in the system. This 
provides the so called stiffness, which acts as order parameter for the critical phase. 
As an example, we calculated the spin stiffness in the spin-1/2 Heisenberg chain and 
compared it with the analytic predictions. The quantitative analysis of the stiffness, in 
combination with a study of the solid ordering in strongly correlated systems through 
structure factor measurements, is of crucial importance in the characterization of the 
different states of matter which may arise, including elusive ones, such as the supersolid 
phase [20]. It is worth stressing that our MPS algorithm inherently accesses very large 
systems, thus directly addressing the system properties in the thermodynamic limit. 

Appendix A. Working principle of the truncated SVD 

Consider first the case in which the m 2 x m 2 matrix M admits exactly r non zero singular 
eigenvalues with r ^ p. This implies that its singular decomposition can be expressed 
as M = UDV with U G Ai m 2 XT , V G A4 rxm 2 isometries, and D = A4 rxr diagonal. Now 

2 

observe that, given z a vector of C r , Uz is a vector of C m . Define then the subspace 
21 C C m spanned by these vectors, i.e., 

21 = Sp&n{Uz : z G C r } . (A.l) 

were able to reach m = 40 with reasonable computational effort, using our MPS periodic algorithm 20 . 
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By construction it has dimension r and its orthogonal complement is the left-kernel of 
M (i.e., it is formed by the vectors of C™ 2 which nullify when we apply M on their 
right). Let us then consider a full rank random matrix x G -M pxm 2 with p < m 2 : by 

2 

construction its rows xi,x 2 , - ■ ■ ,x p will span a p-dimensional subspace 23 of C m . Since 
r < p, we can assume that 23 will have a non trivial overlap with the subspace 21 
(i.e., no non trivial subspace of the latter will be fully disconnected from the former). 
By the same token we also notice that for £ = 1, ■■■,p, the vectors y e = x e M will 

2 

in general span a subspace <£ of C m of dimension no larger than r, which with high 
probability coincides with the image of C™ 2 generated via the application of M on the 
right. Via Gram-Schmidt decomposition we construct now an orthonormal set of row 
vectors {y' e : £ — 1, • • • ,p} which include such space as a proper subspace: using these 
vectors as rows for apx m 2 matrix, we thus construct the y' matrix of the protocol. 
Let then w be a generic row vector of C r : by construction wV will belong to the € 
space and could be expressed as wV = J2e a eUt This yields the identity V = Vy'^y' 
where y' is the p x m 2 matrix whose rows are given by the vectors y' t Now define the 
m 2 x p matrix Z = My'^ and compute its SVD decomposition Z = UDV: since by 
construction we have that Zy' = My'^y' = M we can conclude that M = UDVy' which 
allows us to identify D with D, U with U and Vy' with the isometry V . 

Now consider the case in which M possess r <^ p dominant non zero singular 
eigenvalues, plus others which are negligible (i.e., they are not null as in the previous 
case, but can still be neglected when compared with the first d ones). The previous 
derivation still holds in this case: the main difference being that the approximated 
eigenvalues obtained from Z will correspond to the real ones with an error which scales 
as e (the latter being the relative magnitude of the small singular eigenvalue when 
compared with the large ones). In this respect, it is worth noticing that the procedure 
does not really produce the first p largest singular eigenvalues of M: however, admitting 
that r p, it will approximately produce the first r largest ones. 
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